A Neural Collapse Perspective on Feature Evolution in Graph Neural Networks
Graph neural networks (GNNs) have become increasingly popular for
classification tasks on graph-structured data. Yet, the interplay between graph
topology and feature evolution in GNNs is not well understood. In this paper,
we focus on node-wise classification, illustrated with community detection on
stochastic block model graphs, and explore the feature evolution through the
lens of the "Neural Collapse" (NC) phenomenon. When training instance-wise deep
classifiers (e.g. for image classification) beyond the zero training error
point, NC demonstrates a reduction in the deepest features' within-class
variability and an increased alignment of their class means to certain
symmetric structures. We start with an empirical study that shows that a
decrease in within-class variability is also prevalent in the node-wise
classification setting, however, not to the extent observed in the
instance-wise case. Then, we theoretically study this distinction.
Specifically, we show that even an "optimistic" mathematical model requires
that the graphs obey a strict structural condition in order to possess a
minimizer with exact collapse. Interestingly, this condition can also be
satisfied by heterophilic graphs and relates to recent empirical studies of
settings with improved GNN generalization. Furthermore, by studying the gradient dynamics
of the theoretical model, we provide reasoning for the partial collapse
observed empirically. Finally, we present a study on the evolution of within-
and between-class feature variability across layers of a well-trained GNN and
contrast the behavior with spectral methods.
Comment: NeurIPS 2023
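To make the within-class variability notion concrete, the following is a minimal sketch (not the authors' code; the function name and conventions are illustrative) of the standard NC1 metric, tr(Sigma_W pinv(Sigma_B)) / K, which tends to zero under exact collapse and can be evaluated on the features of each layer:

```python
# Minimal sketch of the NC1 within-class variability metric.
# Illustrative only; not taken from the paper's codebase.
import numpy as np

def nc1_metric(features: np.ndarray, labels: np.ndarray) -> float:
    """features: (n, d) array of deepest-layer (or per-node) features.
    labels: (n,) integer class/community assignments.
    Returns tr(Sigma_W pinv(Sigma_B)) / K; zero under exact collapse."""
    classes = np.unique(labels)
    K = len(classes)
    d = features.shape[1]
    global_mean = features.mean(axis=0)
    sigma_w = np.zeros((d, d))  # within-class covariance
    sigma_b = np.zeros((d, d))  # between-class covariance
    for c in classes:
        class_feats = features[labels == c]
        mu_c = class_feats.mean(axis=0)
        centered = class_feats - mu_c
        sigma_w += centered.T @ centered / class_feats.shape[0]
        diff = (mu_c - global_mean)[:, None]
        sigma_b += diff @ diff.T
    sigma_w /= K
    sigma_b /= K
    return float(np.trace(sigma_w @ np.linalg.pinv(sigma_b)) / K)
```

Tracking such a quantity per layer of a trained GNN (e.g., on stochastic block model graphs) is one way to quantify the partial, rather than exact, collapse described above.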
Perturbation Analysis of Neural Collapse
Training deep neural networks for classification often involves minimizing
the training loss beyond the zero-training-error point. In this phase of
training, a "neural collapse" behavior has been observed: the within-class
variability of the features (outputs of the penultimate layer) decreases, and
the mean features of different classes approach a certain tight-frame
structure. Recent works analyze this behavior via idealized unconstrained
features models where all the minimizers exhibit exact collapse. However, with
practical networks and datasets, the features typically do not reach exact
collapse, e.g., because deep layers cannot arbitrarily modify intermediate
features that are far from being collapsed. In this paper, we propose a richer
model that can capture this phenomenon by forcing the features to stay in the
vicinity of a predefined features matrix (e.g., intermediate features). We
explore the model in the small vicinity case via perturbation analysis and
establish results that cannot be obtained by the previously studied models. For
example, we prove reduction in the within-class variability of the optimized
features compared to the predefined input features (via analyzing gradient flow
on the "central-path" with minimal assumptions), analyze the minimizers in the
near-collapse regime, and provide insights on the effect of regularization
hyperparameters on the closeness to collapse. We support our theory with
experiments in practical deep learning settings.
Comment: ICML 2023
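To illustrate the "vicinity of a predefined features matrix" idea, below is a rough sketch of such a proximity-penalized, unconstrained-features-type objective. This is an assumption-laden illustration, not the paper's exact formulation; the MSE fit term, penalty form, and hyperparameter names are hypothetical:

```python
# Sketch of an unconstrained-features-style objective augmented with a
# proximity term keeping the optimized features H near a predefined matrix H0.
# Hypothetical formulation for illustration; the paper's model may differ.
import numpy as np

def perturbed_ufm_loss(W, H, Y, H0, lam_w=1e-2, lam_h=1e-2, delta=1.0):
    """W: (K, d) linear classifier; H: (d, n) free features; Y: (K, n) one-hot
    targets; H0: (d, n) predefined (e.g., intermediate) features."""
    n = Y.shape[1]
    fit = 0.5 * np.linalg.norm(W @ H - Y) ** 2 / n          # data-fit term
    reg = 0.5 * (lam_w * np.linalg.norm(W) ** 2 +
                 lam_h * np.linalg.norm(H) ** 2)            # weight/feature decay
    vicinity = 0.5 * delta * np.linalg.norm(H - H0) ** 2    # stay near H0
    return fit + reg + vicinity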